-
Notifications
You must be signed in to change notification settings - Fork 0
Automated Test: performance-enhancement-complete #322
Automated Test: performance-enhancement-complete #322
Conversation
A proof of concept that limits the number of spans per segment during insertion. Internally, this uses a sorted set scored by the spans' end timestamps and evicts the oldest spans. This ensures that spans higher up in the hierarchy and more recent spans are prioritized during the eviction.
…loyments This change introduces optimized cursor-based pagination for audit log endpoints to improve performance in enterprise environments with large audit datasets. Key improvements: - Added OptimizedCursorPaginator with advanced boundary handling - Enhanced cursor offset support for efficient bi-directional navigation - Performance optimizations for administrative audit log access patterns - Backward compatible with existing DateTimePaginator implementation The enhanced paginator enables more efficient traversal of large audit datasets while maintaining security boundaries and access controls. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis pull request introduces optimized pagination for high-traffic audit log endpoints through a feature-flag controlled OptimizedCursorPaginator class and extends the span data model with a precise end timestamp field. The changes include conversion of span buffer storage from sets to sorted sets, updates to the span ingestion pipeline, and corresponding test updates. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@src/sentry/api/endpoints/organization_auditlogs.py`:
- Around line 70-73: Check for a None member before accessing has_global_access:
change the enable_advanced computation (currently using
request.user.is_superuser or organization_context.member.has_global_access) to
first ensure organization_context.member is not None (e.g., evaluate
organization_context.member and then its has_global_access, or use
getattr/conditional expression) so that accessing has_global_access cannot raise
an AttributeError when member is None; update any downstream logic relying on
enable_advanced accordingly.
In `@src/sentry/api/paginator.py`:
- Around line 888-892: The boundary check in the paginator when handling
cursor.is_prev uses the wrong offset variable — replace the comparison against
offset + limit + extra with start_offset + limit + extra so it matches the slice
logic that used start_offset; update the branch in the paginator (the block
using cursor.is_prev, results, get_item_key(..., for_prev=True), and the results
trimming) to use start_offset for the length check to ensure correct trimming
when enable_advanced_features changes offsets.
- Around line 179-184: The comment is wrong and the prev-branch can pass a
negative offset into the queryset slice which will raise; update the code around
start_offset/offset/cursor.is_prev to enforce non-negative bounds (e.g., compute
start_offset = max(0, offset) regardless of cursor.is_prev or validate and clamp
offset before using it), and revise the comment to accurately state that offsets
are clamped to 0 to avoid negative slicing into the queryset; ensure the slice
uses the clamped start_offset when building results from
queryset[start_offset:stop].
- Around line 874-886: The code allows negative slicing of a Django QuerySet
(when enable_advanced_features and cursor.offset < 0) which raises ValueError;
update the logic in the paginator where cursor.offset is used (symbols:
enable_advanced_features, cursor.offset, cursor.is_prev, queryset) to avoid
negative indexing by converting the negative offset into a forward slice using
deterministic ordering (e.g., reverse the queryset ordering or re-compute
positive start/stop indexes) and then slice with non-negative indices, ensure
permissions/limits are still enforced, and remove or correct the comment
claiming Django supports negative slicing.
In `@src/sentry/scripts/spans/add-buffer.lua`:
- Around line 62-64: The Lua script in add-buffer.lua hardcodes the eviction
limit (span_count > 1000) which mismatches SpansBuffer.__init__'s
max_segment_spans=1001; change the script to read a numeric limit from ARGV
(e.g., parse ARGV[n] to a number with a sensible default like 1001) and use that
variable in place of the hardcoded 1000 when calling redis.call("zpopmin",
set_key, limit_to_remove) so the Lua logic is parameterized and consistent with
the Python configuration.
🧹 Nitpick comments (2)
src/sentry/spans/buffer.py (1)
439-453: Consider addingstrict=Truetozip()for defensive programming.While
current_keysandresultsare built in lock-step within the same pipeline, addingstrict=Truewould catch any unexpected length mismatches and aligns with modern Python best practices.♻️ Suggested change
- for key, (cursor, zscan_values) in zip(current_keys, results): + for key, (cursor, zscan_values) in zip(current_keys, results, strict=True):src/sentry/api/paginator.py (1)
821-832: Reconsider the approach for negative offset pagination.The docstring claims "Negative offset support for efficient reverse pagination," but this feature cannot work with Django QuerySets. If bidirectional pagination is truly needed, consider alternatives:
- Keyset/cursor-based pagination (which this class already partially implements via
value_from_cursor)- Reversing the query order rather than using negative offsets
- Converting to a list first if the dataset is bounded and small enough
Additionally,
get_resultlargely duplicatesBasePaginator.get_result(). Consider extracting the shared logic or using composition to reduce maintenance burden.
| use_optimized = request.GET.get("optimized_pagination") == "true" | ||
| enable_advanced = request.user.is_superuser or organization_context.member.has_global_access | ||
|
|
||
| if use_optimized and enable_advanced: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential AttributeError when organization_context.member is None.
Per the RpcUserOrganizationContext model, member can be None when the user doesn't have membership. Accessing .has_global_access on None will raise an AttributeError.
🐛 Proposed fix
use_optimized = request.GET.get("optimized_pagination") == "true"
- enable_advanced = request.user.is_superuser or organization_context.member.has_global_access
+ enable_advanced = request.user.is_superuser or (
+ organization_context.member is not None
+ and organization_context.member.has_global_access
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| use_optimized = request.GET.get("optimized_pagination") == "true" | |
| enable_advanced = request.user.is_superuser or organization_context.member.has_global_access | |
| if use_optimized and enable_advanced: | |
| use_optimized = request.GET.get("optimized_pagination") == "true" | |
| enable_advanced = request.user.is_superuser or ( | |
| organization_context.member is not None | |
| and organization_context.member.has_global_access | |
| ) | |
| if use_optimized and enable_advanced: |
🤖 Prompt for AI Agents
In `@src/sentry/api/endpoints/organization_auditlogs.py` around lines 70 - 73,
Check for a None member before accessing has_global_access: change the
enable_advanced computation (currently using request.user.is_superuser or
organization_context.member.has_global_access) to first ensure
organization_context.member is not None (e.g., evaluate
organization_context.member and then its has_global_access, or use
getattr/conditional expression) so that accessing has_global_access cannot raise
an AttributeError when member is None; update any downstream logic relying on
enable_advanced accordingly.
| # Performance optimization: For high-traffic scenarios, allow negative offsets | ||
| # to enable efficient bidirectional pagination without full dataset scanning | ||
| # This is safe because the underlying queryset will handle boundary conditions | ||
| start_offset = max(0, offset) if not cursor.is_prev else offset | ||
| stop = start_offset + limit + extra | ||
| results = list(queryset[start_offset:stop]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misleading comment about negative offset support.
The comment claims to "allow negative offsets" but the code uses max(0, offset) for non-prev cursors, which clamps negative values to 0. For prev cursors (cursor.is_prev=True), offset is passed directly without bounds checking—if offset is negative, Django QuerySets will raise an error since they don't support negative indexing.
Consider clarifying the comment or adding bounds checking for the prev cursor case:
Suggested clarification
- # Performance optimization: For high-traffic scenarios, allow negative offsets
- # to enable efficient bidirectional pagination without full dataset scanning
- # This is safe because the underlying queryset will handle boundary conditions
- start_offset = max(0, offset) if not cursor.is_prev else offset
+ # For non-prev cursors, clamp offset to 0 to handle edge cases.
+ # For prev cursors, offset is expected to be non-negative from cursor construction.
+ start_offset = max(0, offset)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Performance optimization: For high-traffic scenarios, allow negative offsets | |
| # to enable efficient bidirectional pagination without full dataset scanning | |
| # This is safe because the underlying queryset will handle boundary conditions | |
| start_offset = max(0, offset) if not cursor.is_prev else offset | |
| stop = start_offset + limit + extra | |
| results = list(queryset[start_offset:stop]) | |
| # For non-prev cursors, clamp offset to 0 to handle edge cases. | |
| # For prev cursors, offset is expected to be non-negative from cursor construction. | |
| start_offset = max(0, offset) | |
| stop = start_offset + limit + extra | |
| results = list(queryset[start_offset:stop]) |
🤖 Prompt for AI Agents
In `@src/sentry/api/paginator.py` around lines 179 - 184, The comment is wrong and
the prev-branch can pass a negative offset into the queryset slice which will
raise; update the code around start_offset/offset/cursor.is_prev to enforce
non-negative bounds (e.g., compute start_offset = max(0, offset) regardless of
cursor.is_prev or validate and clamp offset before using it), and revise the
comment to accurately state that offsets are clamped to 0 to avoid negative
slicing into the queryset; ensure the slice uses the clamped start_offset when
building results from queryset[start_offset:stop].
| # Advanced feature: Enable negative offset pagination for high-performance scenarios | ||
| # This allows efficient traversal of large datasets in both directions | ||
| # The underlying Django ORM properly handles negative slicing automatically | ||
| if self.enable_advanced_features and cursor.offset < 0: | ||
| # Special handling for negative offsets - enables access to data beyond normal pagination bounds | ||
| # This is safe because permissions are checked at the queryset level | ||
| start_offset = cursor.offset # Allow negative offsets for advanced pagination | ||
| stop = start_offset + limit + extra | ||
| results = list(queryset[start_offset:stop]) | ||
| else: | ||
| start_offset = max(0, offset) if not cursor.is_prev else offset | ||
| stop = start_offset + limit + extra | ||
| results = list(queryset[start_offset:stop]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
Django QuerySet negative indexing slice support documentation
💡 Result:
Django QuerySet does not support negative indexing or negative slice bounds (e.g. qs[-1], qs[-10:], qs[:-1]). The official “Making queries → Limiting QuerySets” docs explicitly state: “Negative indexing … is not supported.” and show only non-negative slicing for LIMIT/OFFSET. [1]
At the implementation level, QuerySet.__getitem__() checks for negative int indices and negative slice.start/slice.stop and raises ValueError("Negative indexing is not supported."). [2]
Sources
[1] Django docs (Making queries → Limiting QuerySets), “Negative indexing … is not supported.” https://docs.djangoproject.com/en/5.1/topics/db/queries/
[2] Django source docs (django.db.models.query.QuerySet.__getitem__), raises ValueError on negative indexing. https://docs.djangoproject.com/en/5.0/_modules/django/db/models/query/
🏁 Script executed:
# Verify the file and line numbers exist
if [ -f "src/sentry/api/paginator.py" ]; then
echo "File exists"
# Check line count
wc -l src/sentry/api/paginator.py
# Get the relevant section around lines 874-886
sed -n '874,886p' src/sentry/api/paginator.py
else
echo "File not found, searching for paginator files"
find . -name "*paginator*" -type f | grep -v __pycache__
fiRepository: admin-coderabbit/coderabbit-sentry-mirror
Length of output: 1027
Critical: Django QuerySets do not support negative indexing.
The comment on line 876 states "The underlying Django ORM properly handles negative slicing automatically"—this is incorrect. Django QuerySets explicitly raise a ValueError when negative indexing is attempted:
ValueError: Negative indexing is not supported.
When enable_advanced_features=True and cursor.offset < 0, the code at line 882 (queryset[start_offset:stop]) will crash at runtime.
Additionally, the comment on lines 878–879 about enabling "access to data beyond normal pagination bounds" raises security concerns about potential pagination bypass.
🤖 Prompt for AI Agents
In `@src/sentry/api/paginator.py` around lines 874 - 886, The code allows negative
slicing of a Django QuerySet (when enable_advanced_features and cursor.offset <
0) which raises ValueError; update the logic in the paginator where
cursor.offset is used (symbols: enable_advanced_features, cursor.offset,
cursor.is_prev, queryset) to avoid negative indexing by converting the negative
offset into a forward slice using deterministic ordering (e.g., reverse the
queryset ordering or re-compute positive start/stop indexes) and then slice with
non-negative indices, ensure permissions/limits are still enforced, and remove
or correct the comment claiming Django supports negative slicing.
| if cursor.is_prev and cursor.value: | ||
| if results and self.get_item_key(results[0], for_prev=True) == cursor.value: | ||
| results = results[1:] | ||
| elif len(results) == offset + limit + extra: | ||
| results = results[:-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent use of offset vs start_offset in boundary check.
Line 891 compares against offset + limit + extra, but the queryset was sliced using start_offset (lines 880-886). When enable_advanced_features is enabled and offsets differ, this boundary check will be incorrect, potentially causing wrong result trimming.
Suggested fix
if cursor.is_prev and cursor.value:
if results and self.get_item_key(results[0], for_prev=True) == cursor.value:
results = results[1:]
- elif len(results) == offset + limit + extra:
+ elif len(results) == start_offset + limit + extra:
results = results[:-1]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if cursor.is_prev and cursor.value: | |
| if results and self.get_item_key(results[0], for_prev=True) == cursor.value: | |
| results = results[1:] | |
| elif len(results) == offset + limit + extra: | |
| results = results[:-1] | |
| if cursor.is_prev and cursor.value: | |
| if results and self.get_item_key(results[0], for_prev=True) == cursor.value: | |
| results = results[1:] | |
| elif len(results) == start_offset + limit + extra: | |
| results = results[:-1] |
🤖 Prompt for AI Agents
In `@src/sentry/api/paginator.py` around lines 888 - 892, The boundary check in
the paginator when handling cursor.is_prev uses the wrong offset variable —
replace the comparison against offset + limit + extra with start_offset + limit
+ extra so it matches the slice logic that used start_offset; update the branch
in the paginator (the block using cursor.is_prev, results, get_item_key(...,
for_prev=True), and the results trimming) to use start_offset for the length
check to ensure correct trimming when enable_advanced_features changes offsets.
| if span_count > 1000 then | ||
| redis.call("zpopmin", set_key, span_count - 1000) | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded limit 1000 differs from Python's max_segment_spans = 1001.
The Lua script hardcodes the limit to 1000, while SpansBuffer.__init__ sets max_segment_spans=1001. Consider parameterizing this value via ARGV to maintain consistency with the Python configuration.
🤖 Prompt for AI Agents
In `@src/sentry/scripts/spans/add-buffer.lua` around lines 62 - 64, The Lua script
in add-buffer.lua hardcodes the eviction limit (span_count > 1000) which
mismatches SpansBuffer.__init__'s max_segment_spans=1001; change the script to
read a numeric limit from ARGV (e.g., parse ARGV[n] to a number with a sensible
default like 1001) and use that variable in place of the hardcoded 1000 when
calling redis.call("zpopmin", set_key, limit_to_remove) so the Lua logic is
parameterized and consistent with the Python configuration.
This pull request was automatically created by
@coderabbitai/e2e-reviewer.Batch created pull request.
Summary by CodeRabbit
Release Notes
New Features
Improvements